NeXus Software Status 
Mark Konnecke, Uwe Filges 
Laboratory for Neutron Scattering 
Paul Scherrer Institute 
CH-5232 Villigen-PSI 
Switzerland 

Abstract 

NeXus is a joint effort of both the synchrotron and neutron scattering communities 
to develop a common data exchange format based on HDF. In order to simplify access 
to NeXus files a NeXus-API is provided. This NeXus-API has been restructured and 
expanded to cover both HDF versions 4 and 5. Only small changes to the API were 
necessary in order to accomplish this. A foundation was laid to extend the NeXus-API 
to further file formats. A new NeXus-API for IDL based on the IDL C interface has been 
implemented. Thus both HDF-4 and HDF-5 NeXus files can be processed from IDL. 
The time-of-flight data analysis program IDA has been adapted to support NeXus. The 
neutron ray tracing package McStas from Risoe has been updated to write NeXus files 
through the NXdict-API. 

1 Introduction 

NeXus 1 aspires to become a common data format for both the synchrotron and neutron 
scattering communities. The aim of the NeXus proposal is to provide an efficient, 
platform-independent, self-describing data exchange format. The NeXus proposal has five 
levels: 

• A physical file format. 

• An Application Programmer Interface (API) for accessing data files. 

• A file structure. 

• Rules for storing single data items. 

• A dictionary of parameter names. 

As the physical file format, the Hierarchical Data Format (HDF) 2 from the National Center 
for Supercomputing Applications (NCSA) was chosen. This is a binary, platform-independent, 
self-describing data format. HDF is well supported by major data analysis 
packages. HDF files are accessed through a library of access functions. This library is 
public domain software. Libraries are available for the ANSI-C, Fortran 77 and Java programming 
languages on all important computing platforms. The HDF-4 library supports 
many different content types such as scientific datasets, groups, images and tables. Of 
these, NeXus uses only the scientific dataset and the group structures. HDF-4 groups allow 
information to be ordered hierarchically in an HDF file, much like directories in a filesystem. As 
the HDF library is fairly complex, a NeXus-API was defined which simplifies access 
to NeXus HDF files. 

The file structure part of the NeXus definition tells application programs 
where to find certain data elements in the file. NeXus files are structured into 
several groups, much like directories in a file system. The NeXus file structure provides 
for multiple datasets in a single file and easy retrieval of plottable information. NeXus 
encourages users to include all necessary information about an experiment in one file and 
not to distribute such information across multiple files. Therefore the NeXus file structure 
provides for a complete description of the instrument used for an experiment. 

The rules for storing individual data items in a NeXus file provide the infrastructure for 
locating the axes of multi-dimensional datasets and require the user to store the units 
of measurement with each data item. 
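
As a minimal sketch of how these rules help an application program, the following Python fragment models one NeXus data group as nested dictionaries and locates the plottable dataset together with its axes. The attribute names signal, axis and units follow the NeXus conventions; the dictionary layout, the dataset names and the helper find_plottable are assumptions made for illustration only and are not part of the NeXus-API.

```python
# Hypothetical in-memory model of one NeXus data group.
# Each dataset carries its values plus a dictionary of attributes.
data_group = {
    "counts": {
        "values": [[1, 2, 3], [4, 5, 6]],
        "attrs": {"signal": 1, "units": "counts"},
    },
    "two_theta": {
        "values": [10.0, 10.5],
        "attrs": {"axis": 1, "units": "degree"},
    },
    "time_of_flight": {
        "values": [100.0, 200.0, 300.0],
        "attrs": {"axis": 2, "units": "microsecond"},
    },
}


def find_plottable(group):
    """Return (signal_name, [axis names ordered by dimension])."""
    signal = None
    axes = {}
    for name, dataset in group.items():
        attrs = dataset["attrs"]
        # Every data item must state its units of measurement.
        if "units" not in attrs:
            raise ValueError(f"dataset {name!r} has no units attribute")
        if attrs.get("signal") == 1:
            signal = name
        if "axis" in attrs:
            axes[attrs["axis"]] = name
    if signal is None:
        raise ValueError("no dataset is marked as the plottable signal")
    return signal, [axes[dim] for dim in sorted(axes)]


signal_name, axis_names = find_plottable(data_group)
print(signal_name, axis_names)
```

A plotting program written this way needs no prior knowledge of the instrument: it only inspects the attributes stored with each data item.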

The NeXus dictionary, the least developed area of the NeXus standard, provides names 
for data items in a NeXus file and file structures for known instrument types. More 
information about NeXus can be found at the NeXus WWW sites: 



http://www.neutron.anl.gov/nexus and http://lns00.psi.ch/NeXus. 



2 A New NeXus-API 

The original NeXus-API as described above was developed to support the then 
prevalent HDF version 4.1 (HDF-4). Over time the HDF-4 library became overly 
complex and also imposed certain limitations on users. Therefore the NCSA decided to 
redesign HDF. The result was a new version, HDF version 5 (HDF-5), which 
has a different access library and a different, incompatible file format. HDF-5 
maintains all the advantages of the HDF-4 format, but a couple of 
important limitations of HDF-4 were lifted: 

• HDF-4 limits the number of objects in an HDF file to 20000. This sounds like a lot, but 
it is not, because most HDF-4 content types consist of multiple objects. 

• HDF-4 restricts the file size to 2 GB. 

• HDF-4 is not thread safe. 

Some of these limitations of HDF-4 had already been hit by NeXus users. Though NCSA 
committed itself to maintaining HDF-4 for some time to come, the need was felt to move 
to HDF-5 as the physical file format for NeXus. 

Therefore the NeXus design team decided to program a new API with the following design 
goals: 

• Support for both HDF-4 and HDF-5. 

• API compatibility to the old version. 

In order to achieve these goals, a framework was developed which also allows further file formats, 
for example XML, to be added to the NeXus-API. 
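
One way to picture such a framework is a table of format-specific handlers behind a single set of API calls. The Python sketch below illustrates the idea only; the class names, the open_file function and the format keys are hypothetical and do not reproduce the actual C implementation of the NeXus-API.

```python
class Hdf4Backend:
    """Stand-in for the HDF-4 specific implementation (hypothetical)."""

    def make_group(self, name, nxclass):
        return f"HDF-4 vgroup {name} of class {nxclass}"


class Hdf5Backend:
    """Stand-in for the HDF-5 specific implementation (hypothetical)."""

    def make_group(self, name, nxclass):
        return f"HDF-5 group {name} with class attribute {nxclass}"


# Adding a further file format, for example XML, means registering
# one more backend here; callers of open_file() do not change.
BACKENDS = {"hdf4": Hdf4Backend, "hdf5": Hdf5Backend}


def open_file(file_format):
    """Select the backend once, when the file is created or opened."""
    try:
        return BACKENDS[file_format]()
    except KeyError:
        raise ValueError(f"unsupported file format: {file_format}") from None


for fmt in ("hdf4", "hdf5"):
    handle = open_file(fmt)
    print(handle.make_group("entry1", "NXentry"))
```

The calling code is identical for both formats; only the argument given at file creation differs, which mirrors the requirement that the user selects the format when creating a new file.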

After some trouble with the HDF-5 libraries the new NeXus-API version 2.0 was 
released. This version can be built to support either HDF-4, HDF-5 or both. The goal 
of API compatibility was achieved with three exceptions: 

• For technical reasons, the mechanism for the creation of compressed datasets 
had to change. 

• When creating a new file, the user has to select whether an HDF-4 or an HDF-5 file is to be created. 

• In HDF-4 groups are ordered in the sequence of their creation in the data file. In 
HDF-5 an alphabetical order is imposed. Thus the order of groups in HDF-4 and 
HDF-5 NeXus files is not identical. The HDF-5 team at NCSA promised to do 
something about this. 

A small problem was posed by NeXus classes. NeXus uses both the name of a group 
and its class name for identification. In HDF-5 the concept of a group class was abandoned. 
However, HDF-5 allows arbitrary attributes on groups. The problem was thus solved 
by storing the NeXus class in a group attribute with the name NX_class. So, the news is 
that there is no news! The NeXus-API stayed the same. The new API is stable and is 
already in production use at the SINQ instruments TRICS and AMOR at PSI. 
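
The class-in-an-attribute solution can be illustrated with a small Python model. This is a sketch under stated assumptions, not the NeXus-API code: groups are plain dictionaries, the helpers make_group and open_group are hypothetical, and the attribute name is held in a constant (NX_class here, the spelling used in NeXus HDF-5 files).

```python
CLASS_ATTR = "NX_class"  # attribute holding the NeXus class of a group


def make_group(name, nxclass):
    """Create a group record; the class lives in an ordinary attribute."""
    return {"name": name, "attrs": {CLASS_ATTR: nxclass}, "children": []}


def open_group(parent, name, nxclass):
    """Find a child group by both its name and its NeXus class."""
    for child in parent["children"]:
        if child["name"] == name and child["attrs"].get(CLASS_ATTR) == nxclass:
            return child
    raise KeyError(f"no group {name!r} of class {nxclass!r}")


# Build a tiny hierarchy: entry1 (NXentry) containing data (NXdata).
root = make_group("/", "NXroot")
entry = make_group("entry1", "NXentry")
entry["children"].append(make_group("data", "NXdata"))
root["children"].append(entry)

found = open_group(open_group(root, "entry1", "NXentry"), "data", "NXdata")
print(found["name"], found["attrs"][CLASS_ATTR])
```

Because the lookup still takes a name and a class, code written against the old group-class concept keeps the same calling convention.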

The new NeXus-API is available for the following programming languages: 

• ANSI-C and C++ 

• Fortran 77 

• Fortran 90 

• Java 

It has been tested on Tru64 Unix 5.1, Linux, Mac OS X and Windows operating systems 
so far. 

It is now recommended to use the newer HDF-5 file format wherever possible. Conversions 
from HDF-4 to HDF-5 and back can be performed with a set of utilities provided 
by NCSA. As distributed, however, these utilities do not produce valid NeXus files. The NCSA utility 
for converting from HDF-4 to HDF-5, h4toh5, was made NeXus compliant with a small 
patch. The patched utility also converts links properly. The corresponding utility h5toh4, for conversion 
from HDF-5 to HDF-4 files, cannot be patched as easily to work with NeXus 
because h5toh4 changes the file structure. There is also a Java utility for conversion 
from HDF-4 NeXus files to HDF-5 NeXus files and back. This utility cannot be used in 
production, though, because it duplicates linked datasets. 

3 A New NeXus IDL API 

There has always been a NeXus-API for RSI's Interactive Data Language (IDL) data 
treatment program. This NeXus-API for IDL (NIDL) was based on the HDF-4 access 
functions provided by IDL. As HDF-5 support is not yet available for IDL, this approach 
was not feasible for the new NeXus-API. Moreover, using IDL HDF-5 
access functions would have meant a complete reimplementation and duplicated code. 
Therefore it was chosen to integrate the new NeXus-API into IDL through IDL's native 
function ANSI-C interface. This also saves work in the case of further extensions 
to the list of NeXus file formats. This approach, an IDL NeXus-API through IDL's native 
functions, has been implemented. It is currently known to work well under Tru64 Unix 
version 5.1. As IDL standardizes many aspects of the native function interface, ports to 
other Unix-like operating systems should be straightforward; a port to Microsoft Windows 
type operating systems may require only a little more work. 

4 New NeXus Aware Software 



4.1 Browsers and Utilities 

Browsers allow users to view the content of a NeXus file and sometimes even to edit it. The 
simplest of such applications is the command line browser NXbrowse which is distributed 
with the NeXus package. It can browse both NeXus HDF-4 and NeXus HDF-5 files. 

A recent addition is the HDFView 3 program from NCSA. This is a merged version of the older 
jhv and h5view applications from NCSA. HDFView can display the hierarchy of a NeXus 
file and give some basic renditions of data. HDFView also allows HDF files to be edited. 
HDFView works with both HDF-4 and HDF-5. HDFView is written in Java and is thus 
available for a variety of platforms. 

There is also a new NeXus Explorer program written in Visual Basic for Windows 
by Albert Chan at ISIS. Besides browsing, this application can also edit the NeXus file. 
With the help of IDL the application is able to do 2D plots of data. The NeXus Explorer 
is only available for the Windows platform from 

http://www.isis.rl.ac.uk/geniebinaries/NexusExplorerBits.zip 

A couple of new NeXus utilities have been developed and included in the 
NeXus-API package: 

NXtree Manuel Lujan from APNS provided NXtree, which displays the structure of a 
NeXus file as a tree. Output can be in text, HTML or LaTeX format. 

NXtoXML is an experimental utility which converts a binary NeXus file into an XML 
file. The XML format has still to be finalised, though. 

NXtoDTD is another experimental program which documents the structure of a NeXus 
file as an XML Document Type Definition (DTD). This tool should help in the process 
of defining the NeXus dictionary and instrument type specifications. Both XML 
utilities were contributed by Ray Osborn, APNS. 
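
To give an impression of what a text tree of a file structure looks like, the following Python sketch renders a nested dictionary as an indented tree. It is not the NXtree code: the dictionary model of a file, the function render_tree and the output layout are all assumptions chosen for illustration.

```python
def render_tree(name, node, indent=0):
    """Return the lines of a text tree for a nested-dict file model."""
    lines = []
    pad = "  " * indent
    if isinstance(node, dict):
        # A dict stands for a group; its keys are the member names.
        lines.append(f"{pad}{name}/")
        for child_name, child in node.items():
            lines.extend(render_tree(child_name, child, indent + 1))
    else:
        # Anything else stands for a dataset; show its length only.
        lines.append(f"{pad}{name} [{len(node)}]")
    return lines


# Hypothetical file content: one entry with a title and a data group.
nexus_file = {
    "entry1": {
        "title": "test run",
        "data": {"counts": [1, 2, 3], "two_theta": [10.0, 10.5, 11.0]},
    }
}

print("\n".join(render_tree("root", nexus_file)))
```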

4.2 Data Analysis Programs 

4.2.1 General Data Analysis Programs 

Besides the known NeXus supporting data analysis tools IDL, opengenie and Lamp, there 
is now a new system named openDave, contributed by the FRM-2 reactor group in 
Munich. OpenDave has a modular architecture which allows data from a 
selection of sources to be processed through various filters and sent to various types of sinks, 
including graphics. OpenDave is written in C++ with the Qt toolkit and is thus restricted 
to operating systems supported by Qt. Unfortunately, Qt requires a developer's license 
on many platforms. 

4.2.2 TOF Data Analysis Programs 

Besides the programs inx, nathan and Isaw, the IDA program from Andreas Meyer, TU 
Munich, was adapted to support NeXus files as produced by the FOCUS instrument at 
PSI. 

4.2.3 Filters to Other Packages 

In this section filters between NeXus and other data formats are discussed. For the 
small angle scattering community two such tools exist: 

psitohmi converts PSI-SANS NeXus files to a format suitable for the SANS data processing 
suite from the Hahn Meitner Institute in Berlin. This utility was provided 
by Joachim Kohlbrecher, PSI. 

nx2ill Ron Gosh wrote nx2ill in order to convert SANS NeXus files to the format understood 
by the ILL SANS data processing tools. 

There is also a small utility which combines powder diffraction NeXus data files 
into one large powder diagram, which is then stored in a format suitable for the Rietveld 
program fullprof. 

4.2.4 Single Crystal Diffractometer Data Analysis 

At PSI a new program named anatric was written for the integration of single crystal 
diffraction data collected with a position sensitive detector (PSD). This program is optimised for the SINQ single 
crystal diffractometer TRICS. TRICS is a four circle diffractometer with a conventional 
Eulerian cradle and three position sensitive detectors positioned on a movable detector 
arm at 0, 45 and 90 degrees offset. Typical measurements involve omega scans across a 
given omega range with all other angles fixed. TRICS saves its data in HDF-5 NeXus 
files. 

Anatric is able to perform two operations: 

Reflection location anatric locates reflections in the data without prior knowledge 
about the crystal through a local maximum search. Reflection positions are determined 
through a center of gravity calculation. The output of this step is a list of 
reflections to be used for indexing or UB matrix refinement. 

Reflection integration anatric can integrate reflection intensities for further processing 
with a crystal structure determination package. 
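
The reflection location step can be sketched as follows. This Python fragment is an illustration only and not the anatric implementation, which is written in C++ and works on full omega scans: here a single two-dimensional detector frame is searched, and the function name locate_reflections, the threshold and the window size are assumptions. The threshold is assumed to be positive so that the centre of gravity is well defined.

```python
def locate_reflections(frame, threshold, half_width=1):
    """Find local maxima above threshold and refine each by centre of gravity.

    frame is a list of rows of counts; returns a list of (row, col) positions.
    Pixels closer than half_width to the detector border are not searched.
    """
    n_rows, n_cols = len(frame), len(frame[0])
    found = []
    for r in range(half_width, n_rows - half_width):
        for c in range(half_width, n_cols - half_width):
            value = frame[r][c]
            if value < threshold:
                continue
            window = [
                (rr, cc)
                for rr in range(r - half_width, r + half_width + 1)
                for cc in range(c - half_width, c + half_width + 1)
            ]
            # Local maximum: no pixel in the window exceeds the centre.
            # (Flat-topped peaks would be reported once per top pixel.)
            if any(frame[rr][cc] > value for rr, cc in window):
                continue
            # Centre of gravity: count-weighted mean pixel position.
            total = sum(frame[rr][cc] for rr, cc in window)
            row_cog = sum(rr * frame[rr][cc] for rr, cc in window) / total
            col_cog = sum(cc * frame[rr][cc] for rr, cc in window) / total
            found.append((row_cog, col_cog))
    return found


frame = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0],
    [0, 2, 8, 4, 0],
    [0, 1, 2, 1, 0],
    [0, 0, 0, 0, 0],
]
print(locate_reflections(frame, threshold=5))
```

In the example the maximum sits at pixel (2, 2), but the asymmetric counts pull the refined column position slightly towards the stronger neighbour, which is the purpose of the centre of gravity calculation.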

Anatric has been designed to cope with two TRICS specific problems. The first is that 
measurements are performed at low and at high two theta at the same time. Together 
with the resolution of the instrument, this has the consequence that reflection positions shift 
by up to 2.5 pixels between omega frames at the high two theta detector. Moreover, at 
TRICS reflections show up as rather large features on the detector. These two factors 
combined would make rectangular reflection boxes for integration excessively large and 
thus impractical. Anatric therefore takes this shift into account and extracts the data to integrate 
along an arbitrary axis through the reflection. This axis is determined experimentally 
from strong reflections in the reflection location step. 
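
The idea of following the reflection through the omega frames can be sketched in a few lines of Python. The sketch assumes that the shift is linear, that is, the box centre moves by a constant pixel offset from frame to frame; this assumption, the function name box_centres and the numbers in the example are illustrative and are not taken from anatric.

```python
def box_centres(peak_frame, peak_pos, shift_per_frame, n_frames):
    """Centre of the integration box in each omega frame.

    peak_pos and shift_per_frame are (row, col) pairs in pixels; the
    centre moves linearly away from its position in peak_frame.
    n_frames is the number of frames taken on either side of the peak.
    """
    centres = {}
    for frame in range(peak_frame - n_frames, peak_frame + n_frames + 1):
        offset = frame - peak_frame
        centres[frame] = (
            peak_pos[0] + offset * shift_per_frame[0],
            peak_pos[1] + offset * shift_per_frame[1],
        )
    return centres


# Example: a reflection peaking in frame 10 that moves 2.5 pixels per
# frame along the detector columns (hypothetical values).
for frame, centre in box_centres(10, (64.0, 80.0), (0.0, 2.5), 2).items():
    print(frame, centre)
```

A small box that moves with the reflection covers the same counts as a fixed box that would have to be 10 pixels wider in this example, which is why the fixed rectangular box becomes impractical.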

The second problem is the large size of the reflections at TRICS in relation to the size 
of the position sensitive detectors (20 x 20 cm). A reflection box for integration must be 
large enough to enclose strong reflections completely. Using such a large reflection box for 
all reflections, however, would discard many perfectly well measured smaller reflections at 
the border of the detectors due to the necessary border tests. In order to cope with this, 
anatric determines the size of the necessary integration box for each reflection individually. 
If the reflection is too weak for this to work, a minimum integration box is used. 

The actual integration of intensities is then performed with a variation of the dynamic 
mask procedure as described by Sjolin and Wlodawer 5. Anatric has been written in C++. 
It is largely finished; however, the program still needs to be verified against a full 
structure refinement. 

4.3 Monte Carlo Simulation Packages 

Monte Carlo simulations of instruments are becoming an increasingly valuable tool for instrument 
design, instrument optimisation and measurement planning. The NeXus file format 
can offer full documentation of the simulation run and efficient transfer of large simulated 
datasets to such packages. Moreover, it could also become possible to simulate data files 
for testing of data analysis programs. Currently only one Monte Carlo simulation program, 
McStas 6, has been ported to use NeXus for data file storage, by Emmanuel Farhi 
at ILL. As of now, only an XML format is supported, but support for binary NeXus files 
will follow soon. Such binary NeXus files will be written through the NeXus dictionary 
API. This allows users to customize the NeXus file structure to their needs. NeXus 
support for the simulation package Vitess 7 is in its early stages. 

5 Future Developments of NeXus 

Most effort in developing the NeXus standard should now go into the refinement of the 
dictionary and the definition of NeXus file structures and content for a couple of different 
instrument types. It would also be good to set up a kind of scheme for standardizing such 
definitions. 

The other thing which needs to be done is to make NeXus installation and linking 
easier. Feedback from users shows that many have difficulties juggling all the 
different libraries, NeXus, HDF-4 and HDF-5, when compiling and linking programs. 
Possible solutions could be autoconf scripts for platforms which support this and perhaps 
the automatic generation of a shell script for linking against NeXus. 

One main objection against NeXus is: it is not ASCII, I cannot edit my data! A 
possible answer to this objection would be an XML NeXus format. XML stands for Extensible 
Markup Language and is a scheme for defining markup languages in ASCII. The best 
known example of a markup language is HTML, which drives the WWW. An XML NeXus 
format is in the process of being defined. It is suggested to extend the NeXus-API to 
support XML as well. 
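
Since the XML NeXus format is still being defined, the following Python sketch can only hint at what an editable ASCII representation might look like. The element names group and data, the name attribute and the function to_xml are purely hypothetical and do not anticipate the eventual format; the sketch uses the ElementTree module from the Python standard library.

```python
import xml.etree.ElementTree as ET


def to_xml(name, node):
    """Convert a nested-dict file model into an XML element tree."""
    if isinstance(node, dict):
        # A dict stands for a group; recurse into its members.
        element = ET.Element("group", {"name": name})
        for child_name, child in node.items():
            element.append(to_xml(child_name, child))
    else:
        element = ET.Element("data", {"name": name})
        # Values are written as plain, editable ASCII text.
        element.text = " ".join(str(value) for value in node)
    return element


# Hypothetical file content: one entry holding one data group.
entry = {"data": {"counts": [1, 2, 3]}}
print(ET.tostring(to_xml("entry1", entry), encoding="unicode"))
```

The resulting text can be opened and changed in any editor, which is exactly what the objection asks for, at the price of larger files than the binary HDF formats.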

It would also be helpful to provide NeXus support for some more general data analysis 
packages, especially free ones such as octave and scilab. 

6 Conclusion 

With the inclusion of the new HDF-5 file format into the NeXus-API, this API is well 
prepared for the future. More and more NeXus aware data analysis software is being written, 
and the software already available covers a wide range of applications. 